| timing | topic |
|---|---|
| 15 | Organising your data for efficient plot descriptions |
| 15 | Grammatical descriptions for plots |
| 30 | Cognitive perception principles |
| 15 | Polishing your plots |
| 30 | Adding interactivity |
What are the variables? WHO Tuberculosis Notifications
Rows: 16
Columns: 22
$ year <dbl> 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012
$ new_sp <dbl> 226, 203, 285, 251, 228, 210, 113, 285, 241, 269, 281, 299, 267, 274, 301, 290
$ new_sp_m04 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 0, NA, 0, 0, 0, 2
$ new_sp_m514 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 3, NA, 3, 2, 2, 1
$ new_sp_m014 <dbl> 1, 0, 0, 3, 1, 1, 0, 0, 0, 1, 3, 2, 3, 2, 2, 3
$ new_sp_m1524 <dbl> 8, 11, 13, 16, 23, 15, 14, 18, 32, 33, 30, 46, 30, 42, 38, 26
$ new_sp_m2534 <dbl> 24, 22, 40, 35, 20, 20, 10, 16, 27, 35, 33, 33, 37, 33, 44, 40
$ new_sp_m3544 <dbl> 18, 18, 54, 25, 18, 26, 2, 17, 23, 23, 20, 20, 16, 22, 26, 17
$ new_sp_m4554 <dbl> 13, 13, 52, 24, 18, 19, 11, 15, 11, 21, 15, 27, 24, 25, 19, 25
$ new_sp_m5564 <dbl> 17, 15, 37, 19, 13, 13, 5, 11, 12, 16, 14, 23, 12, 9, 12, 16
$ new_sp_m65 <dbl> 28, 31, 49, 49, 35, 34, 30, 32, 30, 43, 37, 42, 34, 27, 37, 37
$ new_sp_mu <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0
$ new_sp_f04 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 0, NA, 1, 1, 2, 0
$ new_sp_f514 <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, 4, NA, 3, 3, 1, 1
$ new_sp_f014 <dbl> 0, 2, 0, 0, 1, 0, 0, 0, 2, 2, 4, 3, 4, 4, 3, 1
$ new_sp_f1524 <dbl> 10, 19, 10, 15, 21, 15, 9, 6, 18, 18, 26, 27, 31, 36, 26, 27
$ new_sp_f2534 <dbl> 15, 24, 16, 19, 27, 21, 13, 17, 26, 27, 37, 32, 27, 43, 40, 48
$ new_sp_f3544 <dbl> 9, 15, 18, 12, 16, 15, 3, 5, 11, 14, 20, 14, 14, 12, 23, 15
$ new_sp_f4554 <dbl> 5, 8, 6, 15, 7, 6, 5, 7, 10, 7, 12, 6, 12, 2, 7, 11
$ new_sp_f5564 <dbl> 10, 2, 2, 5, 8, 4, 4, 3, 6, 9, 7, 11, 11, 5, 7, 9
$ new_sp_f65 <dbl> 12, 24, 26, 14, 20, 23, 7, 19, 14, 21, 23, 10, 12, 12, 17, 15
$ new_sp_fu <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0, 0
* Import the CSV file
import delimited "data/TB_notifications_2023-08-21.csv", clear
* Filter for Australia and the years 1997 to 2012
keep if country == "Australia" & year > 1996 & year < 2013
* Keep only the year and variables containing "new_sp"
keep year *new_sp*
* Display the structure of the data
describe
* Show the first few observations
list in 1/10

Illustrations from Julia Lowndes and Allison Horst
Each variable is a column; each column is a variable.
Each observation is a row; each row is an observation.
Each value is a cell; each cell is a single value.
Each table contains one data set.
Long form makes it easier to reshape in many different ways
Wider forms are common for analysis
Long form: one measured value per row. All other variables are descriptors (key variables)
Widest form: all measured values for an entity are in a single row.
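A minimal sketch of moving between the two forms with tidyr, assuming the tidyverse packages are installed. The four counts are the 1997 values for two age groups from the TB data; the object names `tb_long`, `tb_wide` and `tb_back` are illustrative only.

```r
library(tibble)
library(tidyr)

# Long form: one measured value (count) per row; year, sex and age are keys.
tb_long <- tribble(
  ~year, ~sex, ~age,    ~count,
  1997,  "m",  "0-14",  1,
  1997,  "m",  "15-24", 8,
  1997,  "f",  "0-14",  0,
  1997,  "f",  "15-24", 10
)

# Wider form: one row per year and age group, one column per sex.
tb_wide <- tb_long |>
  pivot_wider(names_from = sex, values_from = count)

# And back to long form again.
tb_back <- tb_wide |>
  pivot_longer(c(m, f), names_to = "sex", values_to = "count")
```

Starting from the long form, either reshaping is a single call, which is why it is a convenient form to store data in.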
Steps to wrangle to tidy form:
agesex column)
Is count a variable?
# A tibble: 12 × 4
year sex age count
<dbl> <chr> <fct> <dbl>
1 1997 m 0-14 1
2 1997 m 15-24 8
3 1997 m 25-34 24
4 1997 m 35-44 18
5 1997 m 45-54 13
6 1997 m 55-64 17
7 1997 m > 65 28
8 1997 f 0-14 0
9 1997 f 15-24 10
10 1997 f 25-34 15
11 1997 f 35-44 9
12 1997 f 45-54 5
library(tidyverse)

# tb is the Australian TB notifications data glimpsed above
tb_tidy <- tb |>
  # Drop the overall total and the finer child age groups
  select(-new_sp, -new_sp_m04, -new_sp_m514,
         -new_sp_f04, -new_sp_f514) |>
  pivot_longer(starts_with("new_sp"),
               names_to = "sexage",
               values_to = "count") |>
  mutate(sexage = str_remove(sexage, "new_sp_")) |>
  # First character is sex, the remainder is the age group
  separate_wider_position(
    sexage,
    widths = c(sex = 1, age = 4),
    too_few = "align_start"
  ) |>
  filter(age != "u") |>
  mutate(age = fct_recode(age,
                          "0-14" = "014",
                          "15-24" = "1524",
                          "25-34" = "2534",
                          "35-44" = "3544",
                          "45-54" = "4554",
                          "55-64" = "5564",
                          "> 65" = "65"))
tb_tidy |> slice_head(n = 12)

* Drop specified variables
drop new_sp new_sp_m04 new_sp_m514 new_sp_f04 new_sp_f514
* Reshape data from wide to long format
reshape long new_sp_, i(year) j(sexage) string
* Rename reshaped variable
rename new_sp_ count
* After reshape, sexage already holds only the suffix (e.g. "m014"), so this is a harmless safeguard
replace sexage = subinstr(sexage, "new_sp_", "", .)
* Separate sexage into sex and age
gen sex = substr(sexage, 1, 1)
gen age = substr(sexage, 2, .)
* Drop original sexage variable
drop sexage
* Recode age variable
replace age = "0-14" if age == "014"
replace age = "15-24" if age == "1524"
replace age = "25-34" if age == "2534"
replace age = "35-44" if age == "3544"
replace age = "45-54" if age == "4554"
replace age = "55-64" if age == "5564"
replace age = "> 65" if age == "65"
replace age = "unknown" if age == "u"
* Convert age to a labeled factor variable
encode age, gen(age_factor)
* List the first few observations to check the result
list in 1/10

Data on World Development Indicators (WDI) from the World Bank.
Rows: 4,793
Columns: 23
$ `Country Name` <chr> "Afghanistan", "Afghanistan",…
$ `Country Code` <chr> "AFG", "AFG", "AFG", "AFG", "…
$ `Series Name` <chr> "Access to clean fuels and te…
$ `Series Code` <chr> "EG.CFT.ACCS.ZS", "EG.CFT.ACC…
$ `2004 [YR2004]` <chr> "10.5", "1.9", "45.3", "NA", …
$ `2005 [YR2005]` <chr> "11.9", "2.4", "50.2", "NA", …
$ `2006 [YR2006]` <chr> "13.5", "3", "54.7", "NA", "1…
$ `2007 [YR2007]` <chr> "15.1", "3.6", "59.2", "NA", …
$ `2008 [YR2008]` <chr> "16.6", "4.3", "62.9", "NA", …
$ `2009 [YR2009]` <chr> "18.3", "5.1", "66.4", "NA", …
$ `2010 [YR2010]` <chr> "19.9", "5.9", "69.4", "NA", …
$ `2011 [YR2011]` <chr> "21.3", "7", "72", "NA", "2.5…
$ `2012 [YR2012]` <chr> "22.9", "8", "74.3", "NA", "2…
$ `2013 [YR2013]` <chr> "24.5", "9", "76.1", "NA", "3…
$ `2014 [YR2014]` <chr> "26.1", "10.2", "78", "NA", "…
$ `2015 [YR2015]` <chr> "27.6", "11.4", "79.5", "NA",…
$ `2016 [YR2016]` <chr> "28.8", "12.6", "80.5", "NA",…
$ `2017 [YR2017]` <chr> "30.3", "13.5", "81.6", "NA",…
$ `2018 [YR2018]` <chr> "31.4", "14.5", "82.6", "NA",…
$ `2019 [YR2019]` <chr> "32.6", "15.6", "83.2", "NA",…
$ `2020 [YR2020]` <chr> "33.8", "16.4", "83.8", "NA",…
$ `2021 [YR2021]` <chr> "34.9", "17.4", "84.5", "NA",…
$ `2022 [YR2022]` <chr> "36.1", "18.5", "85", "NA", "…
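One way this layout could be tidied, as a sketch only: the year columns are pivoted into a single `year` variable and the character values converted to numbers. The two-row tibble mimics the structure of the extract; the first row uses values shown above, while the second series code (`SERIES.B`) is a made-up placeholder, as are the names `wdi` and `wdi_tidy`.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Two rows shaped like the WDI extract: one column per year, values as text.
wdi <- tribble(
  ~`Country Name`, ~`Series Code`,   ~`2004 [YR2004]`, ~`2005 [YR2005]`,
  "Afghanistan",   "EG.CFT.ACCS.ZS", "10.5",           "11.9",
  "Afghanistan",   "SERIES.B",       "NA",             "NA"  # hypothetical code
)

wdi_tidy <- wdi |>
  # Gather every "YYYY [YRYYYY]" column into year/value pairs
  pivot_longer(
    cols = matches("^[0-9]{4} \\[YR"),
    names_to = "year",
    values_to = "value"
  ) |>
  mutate(
    year = as.integer(substr(year, 1, 4)),   # keep the four-digit year
    value = as.numeric(na_if(value, "NA"))   # text "NA" becomes a real missing value
  )
```

Whether each series should then become its own column depends on the plot being made; the long form keeps both options open.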
Melbourne weather data from NOAA.
Rows: 1,593
Columns: 128
$ V1 <chr> "ASN00086282", "ASN00086282", "ASN000862…
$ V2 <int> 1970, 1970, 1970, 1970, 1970, 1970, 1970…
$ V3 <int> 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 1…
$ V4 <chr> "TMAX", "TMIN", "PRCP", "TMAX", "TMIN", …
$ V5 <int> 141, 80, 3, 145, 50, 0, 168, 19, 0, 189,…
$ V6 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V7 <chr> " ", " ", " ", " ", " ", " ", " ", " ", …
$ V8 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V9 <int> 124, 63, 30, 128, 61, 66, 168, 29, 0, 19…
$ V10 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V11 <chr> " ", " ", " ", " ", " ", " ", " ", " ", …
$ V12 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V13 <int> 113, 36, 0, 150, 75, 0, 162, 62, 0, 204,…
$ V14 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V15 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V16 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V17 <int> 123, 57, 0, 122, 67, 53, 162, 81, 0, 267…
$ V18 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V19 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V20 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V21 <int> 148, 69, 36, 109, 41, 13, 162, 81, 3, 25…
$ V22 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V23 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V24 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V25 <int> 149, 47, 3, 112, 51, 3, 150, 55, 5, 228,…
$ V26 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V27 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V28 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V29 <int> 139, 84, 0, 116, 48, 8, 184, 73, 0, 237,…
$ V30 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V31 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V32 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V33 <int> 153, 78, 0, 142, -7, 0, 179, 97, 38, 144…
$ V34 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V35 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V36 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V37 <int> 123, 49, 10, 166, 56, 0, 109, 72, 43, 16…
$ V38 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V39 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V40 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V41 <int> 108, 42, 23, 127, 62, 0, 125, 16, 18, 19…
$ V42 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V43 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V44 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V45 <int> 119, 48, 3, 117, 47, 3, 118, 46, 10, 233…
$ V46 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V47 <chr> " ", " ", " ", " ", " ", " ", " ", " ", …
$ V48 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V49 <int> 112, 56, 0, 127, 33, 5, 143, 72, 0, 178,…
$ V50 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V51 <chr> " ", " ", " ", " ", " ", " ", " ", " ", …
$ V52 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V53 <int> 126, 51, 5, 159, 67, 0, 149, 70, 18, 179…
$ V54 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V55 <chr> " ", " ", " ", " ", " ", " ", " ", " ", …
$ V56 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V57 <int> 112, 36, 0, 143, 84, 0, 155, 76, 0, 137,…
$ V58 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V59 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V60 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V61 <int> 115, 44, 0, 114, 11, 64, 118, 52, 53, 17…
$ V62 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V63 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V64 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V65 <int> 133, 39, 0, 65, 41, 3, 141, 34, 13, 209,…
$ V66 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V67 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V68 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V69 <int> 134, 40, 0, 113, 18, 99, 152, 67, 0, 192…
$ V70 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V71 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V72 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V73 <int> 126, 58, 0, 125, 50, 36, 118, 51, 8, 204…
$ V74 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V75 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V76 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V77 <int> 104, 15, 8, 129, 22, 8, 122, 29, 3, 189,…
$ V78 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V79 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V80 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V81 <int> 143, 33, 0, 147, 28, 0, 156, -11, 3, 145…
$ V82 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V83 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V84 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V85 <int> 141, 51, 18, 161, 74, 0, 155, 24, 0, 188…
$ V86 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V87 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V88 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V89 <int> 134, 74, 0, 168, 94, 0, 128, 82, 150, 15…
$ V90 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V91 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V92 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V93 <int> 117, 39, 0, 178, 73, 8, 104, 85, 66, 168…
$ V94 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V95 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V96 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V97 <int> 142, 66, 0, 161, 88, 36, 123, 49, 69, 11…
$ V98 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V99 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V100 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V101 <int> 158, 78, 0, 145, 50, 25, 120, -10, 0, 14…
$ V102 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V103 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V104 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V105 <int> 149, 36, 13, 142, 48, 30, 145, -6, 0, 21…
$ V106 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V107 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V108 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V109 <int> 133, 61, 3, 137, 54, 56, 153, 39, 0, 241…
$ V110 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V111 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V112 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V113 <int> 143, 46, 0, 150, 78, 5, 175, 69, 5, 221,…
$ V114 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V115 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V116 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V117 <int> 150, 42, 25, 120, 47, 69, 150, 45, 0, 13…
$ V118 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V119 <chr> " ", " ", " ", " ", " ", " ", " ", " ", …
$ V120 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V121 <int> 145, 63, 0, 114, 18, 3, 178, 23, 0, 161,…
$ V122 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V123 <chr> " ", " ", " ", " ", " ", " ", " ", " ", …
$ V124 <chr> "a", "a", "a", "a", "a", "a", "a", "a", …
$ V125 <int> 115, 39, 3, 129, 39, 20, -9999, -9999, -…
$ V126 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V127 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ V128 <chr> "a", "a", "a", "a", "a", "a", " ", " ", …
Illustrations from Julia Lowndes and Allison Horst
Tidy data is the starting point for statistical analysis and data visualisation.
Read more in the tidy data paper and the wrangling paper.
\[ X = \left[ \begin{array}{cccc} x_{11} & x_{12} & \dots & x_{1p} \\ x_{21} & x_{22} & \dots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \dots & x_{np} \end{array} \right] \]
Variables \(x_1, x_2, \dots, x_p\) are in the columns, and the \(n\) observations are in the rows.
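To make the correspondence concrete, here is a small base R sketch that turns three rows of the TB data (1997 to 1999, three of the columns shown earlier) into the matrix \(X\). The object name `tb_small` is illustrative.

```r
# Three observations (rows) on p = 3 variables (columns)
tb_small <- data.frame(
  year        = c(1997, 1998, 1999),
  new_sp      = c(226, 203, 285),
  new_sp_m014 = c(1, 0, 0)
)

# A tidy, all-numeric table converts directly to the n x p data matrix
X <- as.matrix(tb_small)
n <- nrow(X)  # number of observations
p <- ncol(X)  # number of variables

# Element x_{22}: second observation, second variable
x22 <- X[2, "new_sp"]
```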
Graphics built on tidy data fit nicely with your statistical analysis too.
A grammar of graphics maps the variables from a tidy data set to elements of the plot.
It’s like having the DNA rather than a species name, so you know how the plots are related to each other.
The same script can be applied to different data.
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(
    mapping = aes(<MAPPINGS>),
    stat = <STAT>,
    position = <POSITION>
  ) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION> +
  <SCALE> +
  <THEME>
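As one possible filling-in of the template, the sketch below draws a column chart of the first five yearly `new_sp` values from the TB data. It assumes ggplot2 is installed; the data frame name `tb_total` and the choice of theme are illustrative, and the facet slot is simply left out.

```r
library(ggplot2)

# new_sp for 1997 to 2001, taken from the data shown earlier
tb_total <- data.frame(
  year  = 1997:2001,
  count = c(226, 203, 285, 251, 228)
)

p <- ggplot(data = tb_total) +          # <DATA>
  geom_bar(                             # <GEOM_FUNCTION>
    mapping = aes(x = year, y = count), # <MAPPINGS>
    stat = "identity",                  # <STAT>: use the counts as given
    position = "stack"                  # <POSITION>
  ) +
  coord_cartesian() +                   # <COORDINATE_FUNCTION>
  scale_y_continuous(name = "Count") +  # <SCALE>
  theme_minimal()                       # <THEME>
```

Swapping `geom_bar()` for `geom_point()` changes the plot type while every other line stays the same, which is the sense in which plots described this way are related to each other.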
* Collapse data to get yearly totals
collapse (sum) count, by(year)
* Generate column plot
graph bar (asis) count, over(year) ///
title("TB Cases by Year") ///
ytitle("Count") ///
name(g1, replace)
* Generate scatter plot with smoothed line
twoway (scatter count year) ///
(lowess count year), ///
title("TB Cases by Year") ///
ytitle("Count") ///
name(g2, replace)
* Combine the two graphs vertically
graph combine g1 g2, col(1) ysize(10)
MAPPING: x=year, y=prop, colour=country
FACET: age
GEOM: point, lm
❌
# A tibble: 10 × 4
year age m f
<dbl> <fct> <dbl> <dbl>
1 1997 0-14 1 0
2 1997 15-24 8 10
3 1997 25-34 24 15
4 1997 35-44 18 9
5 1997 45-54 13 5
6 1997 55-64 17 10
7 1997 > 65 28 12
8 1998 0-14 0 2
9 1998 15-24 11 19
10 1998 25-34 22 24
✅
# A tibble: 10 × 4
year sex age count
<dbl> <chr> <fct> <dbl>
1 1997 m 0-14 1
2 1997 m 15-24 8
3 1997 m 25-34 24
4 1997 m 35-44 18
5 1997 m 45-54 13
6 1997 m 55-64 17
7 1997 m > 65 28
8 1997 f 0-14 0
9 1997 f 15-24 10
10 1997 f 25-34 15
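With the long form, `sex` is a single variable and can be mapped directly to colour. A minimal sketch, assuming ggplot2 is installed; the four counts are the 15-24 age group for 1997 and 1998 from the tables above, and `tb_tidy_small` is an illustrative name.

```r
library(ggplot2)

tb_tidy_small <- data.frame(
  year  = c(1997, 1997, 1998, 1998),
  sex   = c("m", "f", "m", "f"),
  age   = "15-24",
  count = c(8, 10, 11, 19)
)

# sex is one column, so it can be an aesthetic like any other variable
p <- ggplot(tb_tidy_small, aes(x = year, y = count, colour = sex)) +
  geom_point()
```

In the wide form there is no `sex` column to map, so each of `m` and `f` would need its own layer.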
Stata doesn’t really do mappings nicely
* Encode the sex variable if it's not already numeric
encode sex, generate(sex_num)
* Create a custom color scheme
colorpalette tableau, nograph
local colors `r(p)'
* Create the scatter plot
twoway (scatter count year if sex == "m", mcolor("`r(p1)'") msymbol(O)) ///
(scatter count year if sex == "f", mcolor("`r(p2)'") msymbol(O)), ///
legend(order(1 "Male" 2 "Female")) ///
title("TB Cases by Year and Sex") ///
xtitle("Year") ytitle("Number of Cases")

Cleveland and McGill (1984)
Illustrations made by Emi Tanaka
Based on how accurately readers could read off the numerical values.
Primary mapping used in common plots
Place elements that you want to compare close to each other. If there are multiple comparisons to make, you need to decide which one is most important.
Making comparisons across plots requires the eye to jump from one focal point to another. It may result in not noticing differences.
Can you find the odd one out?
Is it easier now?
There are three basic choices of palettes:
Which one you choose depends on the
Resources for exploring color:
Example from the fable package. See unfinished palette work here.
❌ Jet rainbow palette
Produces false detail, banding and color blindness ambiguity.
✅ viridis palettes
Have a uniform scale, match grey scale ladder. The turbo palette alleviates Jet rainbow palette problems.
❌ Jet rainbow palette
Produces false detail, banding and ambiguity.
✅ viridis palettes
Colors still readable and following scale.
If the variable mapped to colour has a right-skewed distribution, consider transforming it using a log or a square root.
This is the same data, where count has been transformed using square root.
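A small sketch of the idea, using made-up right-skewed counts rather than the data in the slides, and assuming ggplot2 is installed. The variable is transformed inside the mapping, so the colour scale spreads over the bulk of the values instead of being dominated by the one large count.

```r
library(ggplot2)

# Illustrative values only: one very large count among small ones
skewed <- data.frame(
  x = 1:6,
  y = 1,
  count = c(1, 2, 4, 9, 16, 400)
)

# Raw scale: the five small counts are squeezed into one end of the palette
p_raw <- ggplot(skewed, aes(x, y, fill = count)) +
  geom_tile() +
  scale_fill_viridis_c()

# Square-root scale: differences among the small counts become visible
p_sqrt <- ggplot(skewed, aes(x, y, fill = sqrt(count))) +
  geom_tile() +
  scale_fill_viridis_c(name = "sqrt(count)")
```

A log transform works the same way, but needs care when the variable contains zeros.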
Famous example: trade between England and the East Indies in the 18th century
Where is the biggest difference?
Let’s play a game!
Which plot wore it better?
For the question
Which country is managing TB best?
Take the following plot, and make it more difficult to read.
Think about what it is you learn from the plot, and how
might change what you learn.
What changes would you make for it to be easier to read?
The BBC cookbook has good basic advice. The work of Amanda Cox has been instrumental in the NY Times data visualisations.
Many elements are important in plot design.
Aspects assist going beyond barriers:
Benefits of html format:
Scripting the check makes it part of your workflow, e.g. with the colorspace R package.
library(colorspace)
clrs <- divergingx_hcl(palette="Zissou 1", n=7)
# Simulate how the palette appears with deuteranope colour vision
clrs_deutan <- deutan(clrs)
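A slightly fuller sketch of such a scripted check, assuming the colorspace package is installed. It simulates three common types of colour vision deficiency and lays the results side by side; the variable names are illustrative.

```r
library(colorspace)

# The palette to check
pal <- divergingx_hcl(n = 7, palette = "Zissou 1")

# Simulated appearance under three colour vision deficiencies
pal_deutan <- deutan(pal)
pal_protan <- protan(pal)
pal_tritan <- tritan(pal)

# Draw the original and simulated palettes as rows of swatches
swatchplot(
  "Original"   = pal,
  "Deuteranope" = pal_deutan,
  "Protanope"   = pal_protan,
  "Tritanope"   = pal_tritan
)
```

If neighbouring swatches in a simulated row become hard to tell apart, the palette is likely to be ambiguous for those readers.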
Some sites allow manual upload of images.
fig-alt: "Three hexagon binned plots. The plot on the left is relatively uniform in colour, and looks like a disk, and the plot on the right has a high concentration of pink hexagons in the center, and rings of green and navy blue around the outside. The middle plot is in between the two patterns."
This is a lovely example. A lot more work is needed.
Simple efforts like marking slide progression with sound.
More examples to come.
GUIs provide explicit control over a small range of interactions.